11 research outputs found

    A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation

    Applying a binary mask to a pure noise signal can result in speech that is highly intelligible, despite the absence of any of the target speech signal. Therefore, to estimate the intelligibility benefit of highly nonlinear speech enhancement techniques, we contend that SNR is not useful; instead we propose a measure based on the similarity between the time-varying spectral envelopes of target speech and system output, as measured by correlation. As with previous correlation-based intelligibility measures, our system can broadly match subjective intelligibility for a range of enhanced signals. Our system, however, is notably simpler, and we explain the practical motivation behind each stage. This measure, freely available as a small Matlab implementation, can provide a more meaningful evaluation measure for nonlinear speech enhancement systems, as well as a transparent objective function for the optimization of such systems.
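
The idea of scoring intelligibility by correlating time-varying spectral envelopes can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation: the function names `band_envelopes` and `envelope_correlation`, the Hann-windowed framing, the uniform split into eight bands, and the plain average of per-band Pearson correlations are all assumptions made only for illustration.

```python
import numpy as np

def band_envelopes(x, frame_len=256, hop=128, n_bands=8):
    """Return an (n_bands, n_frames) array of short-time band energies.

    Assumption: bands are a uniform split of the FFT bins, not an
    auditory filterbank.
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_bins)
    bands = np.array_split(power, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands])

def envelope_correlation(target, output, **kwargs):
    """Mean Pearson correlation between per-band envelopes of two signals."""
    env_t = band_envelopes(target, **kwargs)
    env_o = band_envelopes(output, **kwargs)
    scores = []
    for a, b in zip(env_t, env_o):
        a = a - a.mean()
        b = b - b.mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        # A constant envelope has no defined correlation; score it as 0.
        scores.append(float(a @ b / denom) if denom > 0 else 0.0)
    return float(np.mean(scores))

# Toy signals: amplitude-modulated noise as "target", unrelated noise as "output".
rng = np.random.default_rng(0)
t = np.arange(16000)
target = rng.standard_normal(16000) * (1.0 + np.sin(2 * np.pi * t / 2000))
unrelated = rng.standard_normal(16000)

score_same = envelope_correlation(target, target)
score_scaled = envelope_correlation(target, 0.5 * target)
score_unrelated = envelope_correlation(target, unrelated)
```

Because the envelopes are mean-removed and normalized before comparison, the score in this sketch does not depend on overall output level, which is one reason a correlation-style measure behaves differently from SNR.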

    Model-based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids

    Speech intelligibility is often severely degraded among hearing-impaired individuals in situations such as the cocktail party scenario. The performance of current hearing aid technology has been observed to be limited in these scenarios. In this paper, we propose a binaural speech enhancement framework that takes the speech production model into consideration. The enhancement framework proposed here is based on the Kalman filter, which allows us to take the speech production dynamics into account during the enhancement process. The use of a Kalman filter requires the estimation of clean speech and noise short-term predictor (STP) parameters, and the clean speech pitch parameters. In this work, a binaural codebook-based method is proposed for estimating the STP parameters, and a directional pitch estimator based on the harmonic model and maximum likelihood principle is used to estimate the pitch parameters. The proposed method for estimating the STP and pitch parameters jointly uses the information from the left and right ears, leading to a more robust estimation of the filter parameters. Objective measures such as PESQ and STOI have been used to evaluate the enhancement framework in different acoustic scenarios representative of the cocktail party scenario. We have also conducted subjective listening tests on a set of nine normal-hearing subjects to evaluate the performance in terms of intelligibility and quality improvement. The listening tests show that the proposed algorithm, even with access to only a single-channel noisy observation, significantly improves the overall speech quality, and the speech intelligibility by up to 15%. Comment: after revision

    Model based Binaural Enhancement of Voiced and Unvoiced Speech


    Pitch-based non-intrusive objective intelligibility prediction


    Experimental Study of Generalized Subspace Filters for the Cocktail Party Situation


    The He-rich core-collapse supernova 2007Y: Observations from X-ray to Radio Wavelengths

    A detailed study spanning approximately a year has been conducted on the Type Ib supernova 2007Y. Imaging was obtained from X-ray to radio wavelengths, and a comprehensive set of multi-band (w2m2w1u'g'r'i'UBVYJHKs) light curves and optical spectroscopy is presented. A virtually complete bolometric light curve is derived, from which we infer a (56)Ni mass of 0.06 M_sun. The early spectrum strongly resembles SN 2005bf and exhibits high-velocity features of CaII and H_alpha; during late epochs the spectrum shows evidence of an ejecta-wind interaction. Nebular emission lines have similar widths and exhibit profiles that indicate a lack of major asymmetry in the ejecta. Late-phase spectra are modeled with a non-LTE code, from which we find (56)Ni, O and total-ejecta masses (excluding He) to be 0.06, 0.2 and 0.42 M_sun, respectively, below 4,500 km/s. The (56)Ni mass confirms results obtained from the bolometric light curve. The oxygen abundance suggests the progenitor was most likely a ~3.3 M_sun He core star that evolved from a zero-age-main-sequence mass of 10-13 M_sun. The explosion energy is determined to be ~10^50 erg, and the mass-loss rate of the progenitor is constrained from X-ray and radio observations to be <~10^-6 M_sun/yr. SN 2007Y is among the least energetic normal Type Ib supernovae ever studied. Comment: Corrected error in Tab. 2 & 3. Photometry has not changed.

    Binaural speech enhancement using a codebook based approach

    No full text

    An evaluation of objective measures for intelligibility prediction of time-frequency weighted noisy speech

    No full text
    Existing objective speech-intelligibility measures are suitable for several types of degradation; however, it turns out that they are less appropriate in cases where noisy speech is processed by a time-frequency weighting. To this end, an extensive evaluation is presented of objective measures for intelligibility prediction of noisy speech processed with a technique called ideal time-frequency (TF) segregation. In total, 17 measures are evaluated, including four advanced speech-intelligibility measures (CSII, CSTI, NSEC, DAU), the advanced speech-quality measure (PESQ), and several frame-based measures (e.g., SSNR). Furthermore, several additional measures are proposed. The study comprised a total of 168 different TF-weightings, including unprocessed noisy speech. Out of all measures, the proposed frame-based measure MCC gave the best results (ρ = 0.93). An additional experiment shows that the well-performing measures in this study also show high correlation with the intelligibility of single-channel noise-reduced speech.